Cluster Selection in Divisive Clustering Algorithms

Authors

  • Sergio M. Savaresi
  • Daniel L. Boley
  • Sergio Bittanti
  • Giovanna Gazzaniga

Abstract

This paper addresses the classical problem of unsupervised clustering of a data-set, one of the more basic and common problems in fields such as pattern analysis, data mining, document retrieval, image segmentation, and decision making ([13], [15]). In particular, the bisecting divisive clustering approach is considered. This approach recursively splits a cluster into two sub-clusters, starting from the main data-set. By applying a bisecting divisive procedure recursively, the data-set can be partitioned into any given number of clusters. The clusters obtained in this way are structured as a hierarchical binary tree (or a binary taxonomy), which is why the bisecting divisive approach is very attractive in many applications (e.g. in document-retrieval/indexing problems; see e.g. [23]). Any divisive clustering algorithm can be divided into two sub-problems:

  • selecting which cluster must be split;
  • deciding how to split the selected cluster.

This paper focuses on the first sub-problem and proposes a new method for selecting the cluster to split. This method is here presented with ...
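The two sub-problems described in the abstract can be sketched as two pluggable functions inside a generic bisecting loop. The following Python sketch is illustrative only: the selection rule (largest scatter) and the splitting rule (a small 2-means loop) are placeholder assumptions, not the method proposed in the paper, and all function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]
Cluster = List[Point]


def sqdist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(cluster: Cluster) -> List[float]:
    """Component-wise mean of the points in a cluster."""
    n = len(cluster)
    return [sum(p[d] for p in cluster) / n for d in range(len(cluster[0]))]


def scatter(cluster: Cluster) -> float:
    """Total squared distance of the points from their centroid."""
    c = centroid(cluster)
    return sum(sqdist(p, c) for p in cluster)


def select_largest_scatter(clusters: List[Cluster]) -> Optional[int]:
    """Sub-problem 1 (placeholder rule, NOT the paper's method):
    index of the splittable cluster with the largest scatter."""
    candidates = [i for i, c in enumerate(clusters)
                  if len({tuple(p) for p in c}) > 1]  # needs 2 distinct points
    if not candidates:
        return None
    return max(candidates, key=lambda i: scatter(clusters[i]))


def split_two_means(cluster: Cluster, iters: int = 20) -> Tuple[Cluster, Cluster]:
    """Sub-problem 2 (placeholder rule): bisect a cluster with 2-means.
    Assumes the cluster contains at least two distinct points."""
    # Deterministic seeding: a point far from the centroid, then the
    # point farthest from that one.
    c0 = centroid(cluster)
    a = max(cluster, key=lambda p: sqdist(p, c0))
    b = max(cluster, key=lambda p: sqdist(p, a))
    left: Cluster = []
    right: Cluster = []
    for _ in range(iters):
        new_left = [p for p in cluster if sqdist(p, a) <= sqdist(p, b)]
        new_right = [p for p in cluster if sqdist(p, a) > sqdist(p, b)]
        if not new_left or not new_right or new_left == left:
            break  # degenerate or converged: keep the previous assignment
        left, right = new_left, new_right
        a, b = centroid(left), centroid(right)
    return left, right


def bisecting_divisive(data: Sequence[Point], k: int,
                       select=select_largest_scatter,
                       split=split_two_means) -> List[Cluster]:
    """Recursively bisect until k clusters exist or nothing can be split."""
    clusters: List[Cluster] = [list(data)]
    while len(clusters) < k:
        i = select(clusters)
        if i is None:
            break
        left, right = split(clusters.pop(i))
        clusters.extend([left, right])
    return clusters


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
        (20, 0), (20, 1), (21, 0)]
for c in bisecting_divisive(data, 3):
    print(sorted(c))
```

Because `select` and `split` are passed as arguments, a different cluster-selection criterion can be swapped in without touching the splitting step, which mirrors the separation of the two sub-problems.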

Related resources

Choosing the cluster to split in bisecting divisive clustering algorithms

This paper deals with the problem of clustering a data-set. In particular, the bisecting divisive approach is here considered. This approach can be naturally divided into two sub-problems: the problem of choosing which cluster must be divided, and the problem of splitting the selected cluster. The focus here is on the first problem. The contribution of this work is to propose a new simple techn...

Cluster merging and splitting in hierarchical clustering algorithms

Hierarchical clustering constructs a hierarchy of clusters by either repeatedly merging two smaller clusters into a larger one or splitting a larger cluster into smaller ones. The crucial step is how to best select the next cluster(s) to split or merge. Here we provide a comprehensive analysis of selection methods and propose several new methods. We perform extensive clustering experiments to t...

On the performance of bisecting K-means and PDDP

problem is known as bisecting divisive clustering. Note that by recursively using a divisive bisecting clustering procedure, the dataset can be partitioned into any given number of clusters. Interestingly enough, the clusters so-obtained are structured as a hierarchical binary tree (or a binary taxonomy). This is the reason why the bisecting divisive approach is very attractive in many applicat...

A Multi-level Approach for Document Clustering

The divisive MinMaxCut algorithm of Ding et al. [3] produces more accurate clustering results than existing document clustering methods. Multilevel algorithms [4, 1, 5, 7] have been used to boost the speed of graph partitioning algorithms. We combine these two algorithms to construct a faster and more accurate algorithm. In this new algorithm, the original graph is coarsened, partitioned by the divi...

Robust DNA Microarray Clustering Techniques for Oncological Diagnosis

Machine learning techniques are increasingly popular tools for understanding complex biological data. Prior research has demonstrated the power of simple statistical clustering algorithms for disease class discovery and prediction. In this work we examine the efficacy of spectral and divisive clustering on gene expression microarray data. In particular we consider simultaneous expression cluste...

Publication date: 2002